 action detection


VideoCapsuleNet: A Simplified Network for Action Detection

Neural Information Processing Systems

Recent advances in Deep Convolutional Neural Networks (DCNNs) have shown extremely good results for video human action classification; however, action detection is still a challenging problem. Current action detection approaches follow a complex pipeline that involves multiple tasks such as tube proposals, optical flow, and tube classification. In this work, we present a more elegant solution for action detection based on the recently developed capsule network. We propose a 3D capsule network for videos, called VideoCapsuleNet: a unified network for action detection that can jointly perform pixel-wise action segmentation along with action classification. The proposed network is a generalization of the capsule network from 2D to 3D, which takes a sequence of video frames as input.









Are all Frames Equal? Active Sparse Labeling for Video Action Detection

Neural Information Processing Systems

We demonstrate that the proposed approach performs better than random selection, outperforming all other baselines, with performance comparable to the supervised approach using merely 10% of the annotations.